perm filename APP2[AIM,DBL]1 blob sn#121227 filedate 1974-09-24 generic text, type T, neo UTF8
00100	.DEVICE XGP
00200	.FONT 1 "FIX25"
00300	.FONT 2 "SIGN57"
00400	.FONT 3 "SHD40"
00500	.FONT 4 "BDI25"
00600	.FONT 5 "NGB30"
00700	.FONT 6 "NGR20"
00800	.TURN ON "↓_π{"
00900	.TURN ON "⊗" FOR "%"
01000	.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
01100	.MACRO E ⊂ APART END ⊃
01200	.TABBREAK
01300	.COMPACT
01400	.EVERY FOOTING(⊗6First Draft .... {DATE},page A2.{IF PAGE = 1 THEN 1 ELSE PAGE}⊗*,)
01500	.EVERY HEADING(⊗4Synthesis of Large Programs⊗*,⊗3BEINGS⊗*,⊗4Doug Lenat⊗*)
01600	.COUNT PAGE PRINTING "1"
01700	.NEXT PAGE
     

00100	.SELECT 1
00200	
00300		⊗2APPENDIX 2.⊗* ⊗3THE BEINGS⊗*
00400	
00500	Here we present summaries of the knowledge embedded in each BEING.  Only the
00600	most important parts of each BEING are even mentioned.  First we present those
00700	BEINGS used to write CF. Next come the ones we had to add to the pool to get
00800	PUP6 to write GI and AIR.  Among the first group, we further subdivide it into
00900	(a) high-level, domain-specific, (b) low-level domain-specific, (c) ubiquitous,
01000	problem-independent, (d) high-level, problem-independent, (e) low-level,
01100	problem-independent, (f) non-BEING knowledge in variables, and (g) a few
01200	interesting demons and functions.  The additions following CF are so small we
01300	don't subdivide their descriptions.
01400	
01500	.SELECT 5
01600	
01700	    (i)	The knowledge necessary to write a concept formation program:
01800	
01900		A. High-level, domain-specific knowledge
02000	
02100	.SELECT 1
02200	
02300	⊗4CONCEPT-FORMATION⊗*
02400	The user must be aware that we are about to undertake concept formation.
02500	Inference and attention-focussing demons must be turned on.
02600	After completion of this task, PUP will be able to learn concepts.
02700	This is a specialized form of attending, learning, and doing inductive
02800	inference. It is an alternative to grammatical inference, pattern
02900	recognition, and simulated evolution. Its structure must be one of the
03000	following: classificatory concept formation, comparative concept formation,
03100	or metrical concept formation. We must make the boolean decision as to
03200	whether or not concepts may vary with time. Similarly, whether the speed
03300	of presentation of the stimuli is relevant; if so, then we must constrain
03400	the effort spent on various phases of identification. Instances may be
03500	left in view indefinitely, or may be removed right after processing;
03600	this latter case holds for CF; it means we must derive all relations
03700	(features) as soon as we see a scene. The program will have to be  just
03800	complex enough to handle conjunctive, disjunctive, or both kinds of
03900	concpts; this is another decision to make. Similarly for positive,
04000	negative, both, or neither kinds of transfer (psychological), which
04100	affects the recognition that a concept is new, and how previously
04200	learned concepts interact with the learning of new ones. We must whether
04300	to use positive, negative, or boht kinds of instances of a concept.
04400	Subject-specific behavior may be required; that is, the program may
04500	have to input a vector describing a particular individual, and its
04600	whole structure must mimic this subject. The last decision is one of
04700	adapting the program to an extended sample dialogue which the user must
04800	furnish; this will help both to check out the program writtten, and to
04900	fix various print statements and I/O formats. It is easy to call this
05000	being (.1), it has a 50-50 chance of calling* itself, it has only a
05100	0.5 chance of succeeding, but the effort to try it is moderate (.5).
05200	There is no fundamental reason for delaying its investigation (.1).
05300	It recognizes itself only through exact matching of one of seven
05400	patterns. It has sentences describing what it does, how, and why.
05500	It is unlikely (-70) to be called if some type of concept formation
05600	is already doable by PUP; if PUP wants to characterize classes, then
05700	it's very likely (88) to be called. The presence of new information
05800	delays (-60) our calling the being, since it might affect what we
05900	should do. Conversely, the absence of new information mildly (40)
06000	encourages us to go on and try it. When finished, the user must be
06100	aware that PUP has decided on a particular type of concept formation,
06200	and that it has done it. The other beings affected depend on the
06300	decisions mentioned earlier.
06400	
06500		This is an over-abundance of information. From now on, I will
06600	only give the little pieces of information which are crucial to the
06700	writing of some part of the CF system.
06800	
06900	⊗4CLASSIFICATORY CONCEPT FORMATION⊗*
07000	To do this, we must partition a domain, in an interactive "guessing"
07100	manner.
07200	
07300	⊗4COMPARITIVE CONCEPT FORMATION⊗*
07400	Same as above, but then we must partially order the equivalence classes
07500	of the partition. It is harder, also.
07600	
07700	⊗4METRICAL CONCEPT FORMATION⊗*
07800	Same as previous being, but we must also induce a metric on the
07900	partial ordering of the classes. This is even more complicated.
08000	
08100		Since we actually do classificatory CF, the beings to order
08200	a partition and to metrize an ordering weren't implemented.
08300	
08400	⊗4PARTITION A DOMAIN⊗* in a guessing, interactive manner.
08500	The partition may be only partial, it may be only weak, and, most
08600	crucially, the being must be able to do some of these: partition by
08700	accepting an element and getting its class name (guessing the name
08800	and then checking it somehow), partition by accepting a class name
08900	and getting its member elements, partition by accepting ordered
09000	pairs <element, classname>.  The fringe of conciousness demon must
09100	be activated from now on.
09200	
09300	⊗4PARTITION BY TAKE ELEMENT GET CLASS⊗*
09400	Take hold of an element (by reading, for example), and then work to
09500	get the name of the class it belongs to (by guessing, then verifying
09600	the guess, for example). Then modify the structure of the class(es)
09700	involved.
09800	
09900	⊗4PARTITION BY TAKE CLASS GET ELEMENT⊗*
10000	Take hoold of a class name, and then work to get elements it contains.
10100	Then modify the structure of the class and the element(s) involved.
10200	
10300	⊗4PARTITION BY TAKE ELEMENT AND CLASS⊗* simultaneously.
10400	Take hold of both and element and its corresponding class name, and
10500	use these to modify the structure of the partition; i.e., the class
10600	mentioned if the partition is stored by classes.
10700	
10800	
10900		⊗5B. Low-level, domain-specific knowledge⊗*
11000	
11100	⊗4RECOGNIZE CONTRADICTION⊗*
11200	It can translate (... is incompatible with ...). It is a predicate,
11300	fairly easy to write therefore. It is composed of SOMEOF the following:
11400	Probability≡0 contradiction, Probability≡1 contradiction, Probability
11500	>0 and <1 contradiction.  Since these names are fairly cryptic, some of
11600	their parts (e.g, HOW) must be printed out to help the usr choose.
11700	
11800	⊗4PROBABILITY ≡ 0 CONTRADICITON⊗*
11900	Since this is a very simple thing in our domain of concept formation,
12000	it is immediately encodable as (MEMBER arg1 arg2). That is, if arg1
12100	has probability zero of occurring in arg2, yet it does, then we have
12200	a contradiction.
12300	
12400	⊗4PROBABILITY ≡ 1 CONTRADICTION⊗*
12500	Immediately encodable as (NOT (MEMBER arg1 arg2)). If arg1 has probability
12600	one of occurring in arg2, yet it doesn't, then we have a contradiction.
12700	
12800	⊗4PROBABILITY >0 AND <1 CONTRADICTION⊗*
12900	Immediately enacodable as NIL. If arg1 might and might not occur in arg2,
13000	we can't get a contradiction just by checking its membership. Of course,
13100	the idea that this is the only way to prove contradiction is what makes 
13200	these beings domain-specific!
13300	
13400	⊗*SCENE⊗*
13500	This is a data structure, composed of four subparts. The first is
13600	a set O of objects. Next is an atom indicating the name N of the scene.
13700	Next come two lists of features, where a feature is just a predicate
13800	and its arguments. The first is the static relations S between objects.
13900	Finally we have the dynamic relations D between objects.
14000	
14100		⊗5C. Ubiquitous, problem-independent beings and functions⊗*
14200	
14300	⊗4CHOOSE FROM⊗*
14400	All its arguments must be beings, else it prints a nasty warning message.
14500	We select the best being and apply it. If it fails, we re-order the  
14600	remaining beings and apply the best one of them. Note that this new
14700	reordering may use knowledge gleaned during the earlier, unsuccessful
14800	being try.  Thus, this is a (possible) intelligent nondeterministic
14900	branch point. The intelligence lies (or fails to lie) in the comparison
15000	function, BETTER, which decides who goes next.
15100	
15200	⊗4SATISFY⊗*
15300	This is the equivalent of a pattern-directed goal statement. We ask
15400	each being, "Can you do anything matching x?".  We take the list of those
15500	answering affirmatively, and CHOOSE FROM it one being after another
15600	until the desired effects are realized. Notice that a being who said
15700	"probably" may succeed in his application and yet not effect the result
15800	we wanted, so that a trivial call on CHOOSE FROM is insufficient.
15900	The beings possible affected are just those answering affirmatively.
16000	
16100	⊗4MESSAGE⊗*
16200	This being has a main effect (AWARE USER x), hence is very frequently
16300	called.  The forgetful-user demon trims the aware user list periodically.
16400	Message looks to see if its message is on that list; if not, it inserts it
16500	and prints it out to the user. If it is, it moves the message to the front
16600	of the aware-user list and prints out nothing.
16700	
16800	⊗4DETERMINE ARG VALUES⊗*
16900	These functions take as input the name of a function, and output a
17000	description of what arguments it will ever be called with (in the existing
17100	code.) For example, it might say "arg1 will always be NAME:OF:CLASS, and
17200	arg2 will consecutively be all integers from 3 to (LENGTH SET:OF:CLASSES)".
17300	At present these work in the obvious way, looking at everything. The
17400	tremendous amout of CPU time spent in these functions indicates that I
17500	should have made special assert:lists for argument instantiations, and
17600	updated them each time a being is called in the target code.
17700	
17800	⊗4FLOW-PRECEDED⊗*
17900	This being must search through  he code to find a form matching a given
18000	pattern.  Although it is used under ten times in the total dialog, it is
18100	so costly that I've implemented it as an ask-the-user call. Work must be
18200	done here to understand why this is so inefficient, and to remedy it.
18300	
18400	⊗4FIND AND TAG⊗*
18500	This being is similar to flow-preceded, except that the pattern-matching
18600	is between two constant strings.  This is tolerably efficient in CPU time
18700	and is used heavily throughout the writing of CF.
18800	
18900	⊗4TRANSLATE⊗*
19000	The natural language front end is managed by this being. It asks each being
19100	whether it recognizes a given string. Translate then takes the "best" --
19200	the most probable -- of these as the translation, and can backtrack and
19300	reorder the remaining interpretations if it has to. If called with no
19400	argument, it examines various assert-lists to see if it can do any good.
19500	The idiom demon must be activated during the control period of this being.
19600	
19700	⊗4REINVESTIGATE DECISION⊗*
19800	This is usaully called by a demon who watches the deferred-decision
19900	assert list. We transfer the decision in question from the deferred to
20000	the undeferred decision assert list. A deferral demon will promptly
20100	react to anything on this latter list. An interesting caution: it was 
20200	necessary to inhibit all demons during the execution of this being, for
20300	reasons very similar to those leading to lock and unlock system commands.
20400	THe fact that some being might have to be demon-uninterruptable forced us
20500	to institute an entire new question asking just about this tiny point!
20600	
20700	⊗4DEFER DECISION⊗*
20800	Remove the decision from the undeferred decision assert list.
20900	Determine the situation when we must next reinvestigate this decision.
21000	This will be some predicate examing the state of the world. If this
21100	predicate is true currently, we must resolve the decision now. Otherwise,
21200	we put the decision on the deferred decision list, attached to its (new)
21300	reinvestigation predicate. Demons must be inhibited during this being's
21400	reign, to ensure that it's notions of the world are accurate upon exit.
21500	
21600	⊗4WHEN NEXT⊗*
21700	Manipulate the decision to extract the name of the variable holding
21800	information relevant to deferring the decision. Utilize this knowledge
21900	so as to keep the effects of the decision irrelevant; i.e., find the
22000	(next) situation in which they are not irrelevant.  Whoever called
22100	this being is now asserted to be aware of its results. THis is fairly
22200	complex, and the being is not called unless it is necessary. As it happens,
22300	it is called a few times for every decision to be made about every being
22400	in the target program (several hundred times).
22500	
22600	⊗4UTILIZE⊗*
22700	This being applies various knowledge "variables," starting at specific
22800	ones and moving toward very general ones, until one of them reports
22900	being able to acheive the desired goal.
23000	
23100	⊗4RESOLVE DECISION⊗*
23200	Again, all demons must be inhibited. After some preliminary searching
23300	and very trivial theroem-proving fail, this being resorts to asking the
23400	user about how to resolve a given decision.
23500	
23600	⊗4ASK USER ABOUT⊗*
23700	We determine the argument instantiations of the little piece of code
23800	we're worrying about, determine the type of decision to be made, and
23900	apply the specific knowledge variable for x-ing that type of decision.
24000	Here, we get x by examing who called this being and why. To write a
24100	specialized version of ask-user-about, we just write a standard
24200	print, read, and assign function, with the details left unspecified
24300	until the sample session is read in.
24400	
24500	⊗4BETTER⊗*
24600	This function is used to compare two beings, and decide which of them
24700	should gain control.  It evaluates their WHEN parts, and if they tie
24800	it evaluates their complexity vectors. Note that "eval" here is not
24900	trivial: each dimension of the complexity vector of a being can be a
25000	little program which examines itself, other beings, and the state of the
25100	world before deciding on a numerical answer to rEturn.
25200	
25300	⊗5Handling of User Interrupts⊗*
25400	There are several functions and beings involved in this process.
25500	Initially, the user describes how often the system is to give him the
25600	opportunity to interrupt and query it.  At each of these times, the 
25700	HANDLE USER INTERRUPT function asks the user if he wants to interrupt;
25800	if so, PROCESS USER INTERRUPT is called to do the job. In addition to
25900	asking for pieces of any being, the user may request limited simulated
26000	execution of various pieces, and may order the current being to FAIL.
26100	
26200	
26300	
26400		⊗5D. High-level, problem-independent knowledge: how to write programs⊗*
26500	
26600	⊗4SERVE⊗*
26700	Obtain information until some of it is "executable," then carry it out.
26800	The forgetful-user demon is activated. Without this top-level purpose,
26900	PUP5 sits contentedly, never wanting to accept a new task.
27000	
27100	⊗4WRITE PROGRAM⊗*
27200	The user must be made aware that PUP is about to write a program, what
27300	kind of program it is, what its name is (this will force a get-name
27400	being call), and that its type has been examined (this will cause a
27500	study-type being call). Upon exit, he must be told that PUP has completed
27600	the task, and what its new capabilities are. To wite a program, one enters
27700	a loop, broken only when several completion conditions are all true
27800	simultaneously: the top-level task is now a being, there are no undefined
27900	sections of code, there are no warnings left about the code, there is
28000	no executable information anywhere, ther is no new but unprocessed
28100	information, there are no decisions still pending (except those requiring
28200	"everything else" to be complete; e.g., the adaptation ofoutput formats
28300	using a sample session). If we do break out of the loop, we must update
28400	the list of programs written, the list of what PUP can now do, of what
28500	the user may do, we find the set of support of the top-level task and
28600	create a new file with the relevant functions and beings (which auto-
28700	matically does global initializations and then enters the top-level
28800	task instead of SERVE).     In general, of course, we won't break out,
28900	so we activate all the current demons and go on. All the body of the
29000	loop is is one CHOOSE FROM, between six alternatives: obtaining some
29100	usable info, using some usable info, filling in some function call which
29200	is currently undefined, clarifying some little piece of code known to
29300	be improbable for some specific reason, adapting some known function
29400	to conform to some specific new requirements, fixing some piece of code
29500	which doesn't work the way it claims to work. The last two of these are
29600	simply program modification and debugging, respectively! Failure of one
29700	of these six beings simply causes CHOOSEFROM to try another one; failure of
29800	a demon causes the whole WRITE PROGRAM being to fail. During its reign,
29900	the program-writing demons, deferral-demon, ad reinvestigation-demon are
30000	all activated.  Its complexity vector is dependent upon that of the
30100	being closest to the task it must perform.
30200	
30300	⊗4OBTAIN USABLE INFORMATION⊗*
30400	The WHEN part informs us that this is always undesirable (-10) but
30500	is OK (111) if there exists new but not yet usable information. All
30600	we do here is CHOOSE FROM the following four alternatives: translate
30700	something, get brand new information, analyze the implications of
30800	existing information, extract a small relevant subset of the existing
30900	information to concentrate on.
31000	
31100	⊗4USE INFORMATION⊗*
31200	This demands that some executable information exist. We select one such
31300	piece, and try to execute it. If we fail, its worth is decreased; if we
31400	succeed, it is removed from the executable info assert list.
31500	
31600	⊗4FILL IN UNDEFINED SECTION⊗*
31700	THere must be some undefined section. If so (80) we don't want any
31800	hi-priority (≤20) coding warnings still around  (-150), and we do like
31900	there to be something both undefined and known to be encodable (96).
32000	We fix a choice of what to encode, and try to acheive its encoding.
32100	If we fail, we update the difficulty of that choice, and may assert that
32200	we want some specific new information to relieve the problem.  In addition
32300	to ENCODE, this being also may call MAKE ENCODABLE and STUDY TYPE.
32400	
32500	⊗4CLARIFY IMPROBABLE SITUATION⊗*
32600	This being demands that something of mediocre priority (≤500) exist on
32700	the coding warning assert-list. It likes (51) this, and dislikes (-41)
32800	anything on the undefined section list, or anything (-200) on the encodable
32900	section list.  As always, a sentence is provided to justify each of these
33000	little beliefs. We choose the warning with the highest priority (lowest
33100	numerical weight) on the coding warning list, note that that is what PUP
33200	is working on now, and do a match to decide what type of warning it is.
33300	(i) Replace x by y in z 
33400	Here these may be nonspecific; z may be "in all code recently generated".
33500	The nature of y may cause us to include new warnings; y may mention a new
33600	data structure.
33700	(ii) x in y is undefined; probably z since r
33800	This may cause us to add to the global initialization list. It will
33900	probably cause us to ask the user what the answer is.
34000	(iii) x is a data structure but we don't know much about it
34100	We try to find out its structure, how to initialize, access, insert,
34200	delete, update it. A variant of this warning is:
34300	(iv) We find no x's  associated with data structure y
34400	Here x can point to insertions, deletions, initializations, etc.
34500	If we can't justify the lack, we try to defer the decision. Failing
34600	that, we ask the user.
34700	(v) Command if x then y
34800	This is a programmed demon; when situation x is true, we must do y.
34900	(vi) Delete all mention of x
35000	This is like a replace, but we go through the assert lists with an eye
35100	toward deleting unnecessary worries.
35200	(vii) Infinite loop in x from y to z
35300	If we can't justify this, we insert a test to break out of the loop.
35400	Justification might be that this loop is in the top-level function of
35500	the system, where we never wnat to break out.
35600	(viii) Incomprehensible (there is a "bug" in x manifesting itself as y)
35700	Never needed to write CF, so not implemented. SHould call FIX INCORRECT
35800	PIECE (which is also not in yet) or ask the user for assistance.
35900	
36000	⊗4GET NEW INFORMATION⊗*
36100	Naturally, it is not thrilled if (-68) there exists some new but 
36200	unexamined information, and it is happy (50) if there is none.
36300	The prerequisites ensure that the user is aware of what PUP wants,
36400	and if the theorem prover can't deliver it, PUP asks the user for some.
36500	If PUP asks for something general ("any task") it is because it knows
36600	precisely that this is what it wants, not out of ignorance! During
36700	execution, the specificity check demon is active; he ensures that
36800	it is indeed phrased as specifically as possible; if not, MAKE SPECIFIC
36900	will be called. This is a very uncomplicated being, and a very unpopular
37000	one to use since we should squeeze every drop of meaning out of what
37100	we have before asking for more information.
37200	
37300	⊗4EXTRACT RELEVANT SUBSET⊗*
37400	This likes (70) there to be a great deal (≥50 pieces) of new information,
37500	and dislikes (-80) it if there are under a dozen such tokens.
37600	It finds and evaluates knowledge variables to constrain what should be
37700	looked at currently. This is never called in the dialog, though it was in
37800	the protocol.
37900	
38000	⊗4ANALYZE IMPLICATIONS⊗*
38100	The WHEN part is unhappy (-60) if there is usable information already,
38200	since this being is fairly costly. It also examines the size of the
38300	new info list to see just how long this search will be. The being locates
38400	the code that will be affected, predicts the affects, and sees how
38500	desirable this is. This being is also never needed to write CF.
38600	
38700	⊗4MAKE ENCODABLE⊗*
38800	If all else fails, this being tries replacing a function by one of its
38900	alternatives or, as a last resort, by one of its generalizations.  This
39000	last resort is never needed to write CF.
39100	
39200	⊗4ENCODE⊗*
39300	Despite its key name, this being is mostly a bookkeeper! It runs around
39400	the assert lists, gathering up enough information to encode a specified
39500	little piece of code. A program schema is built up, instantiated as much
39600	as possible, printed out for the user to refer to, and passed to a highly
39700	optimized recursive auxilliary function, GETCODE. Some woorying about the
39800	arguments is done,including what they might be instantiated as. We inform
39900	the user of the code being written, and a prerequisite causes the function
40000	to become made into a being.
40100	
40200	⊗4STUDY TYPE⊗*
40300	This being accepts a being call, looks at that being's specializtions
40400	part, translates each entry into a decision, and dumps these onto the
40500	undeferred decision list. A deferral demon promtly attends to them.
40600	
40700	⊗4GET DATA STRUCTURE⊗*
40800	We call select-structure type, and use its advice to point to the
40900	proper knowledge variable. We eval it to set up the various parts of
41000	the data structure. In unclear cases, we may ask the user whether the
41100	argument really is a data structure. We ensure that this object is
41200	a being (else we make it one,) and we add warnings to the effect that
41300	there might not be any insertions or deletions; we'll worry about that
41400	much later, by another being. We know that the elements of a data
41500	structure are themselves usually data structures, so one the prerequisites
41600	says that we must try to make the arguments into data structures as well.
41700	
41800	⊗4GETCODE⊗*
41900	This is a long, special-case, recursive function. It goes through a piece
42000	of code with two jobs: to build up a list of arguments to this function,
42100	and to get names for specific instances of general beings.
42200	Part of the kludgy character is due to the fact that there are non-beings
42300	in the code: primitive forms (structure, tuple, vector, class, comment,
42400	atoms), primitive functions (read, null, and ,...), primitive variables
42500	(t, nil, 4, ...).  By keeping closely to the theoretical ideals implicit
42600	in the ideas, this function would probably become trivial.
42700	
42800	⊗4GET NAME⊗*
42900	First we study plausible names for the new specialized being. We make the
43000	user aware of them. We examine the being name carefully, to see if it
43100	was just mentioned in the same piece of code (probably then the same name),
43200	whether it's ever been mentioned and specialized, and so on. Sometimes we
43300	end up deciding not to get a new name, sometimes we pick one and just tell
43400	the user, sometimes we recommend some old specialized name. In 90% of the
43500	cases, though, we simply say "give me a name for ...". A new-name counter
43600	is maintained to ensure unique names, and this number is appended onto the
43700	end of what the user types, uunless it's an already-existing being name.
43800	The user my type ] (and usually does), indicating that PUP is to choose.
43900	PUP always informs the user what the name is at the end.
44000	
44100	⊗4MAKE NEW BEING⊗*
44200	This being has the awesome responsibility of giving life to new beings.
44300	As is the case with neurons, soldiers, and all (good) beings, no one being
44400	really does much when you look at it in isolation. Thus this one already
44500	gets the name of the being, the name of its parent (its least general
44600	generalization), how the metacodes of these tow differ, the arglist,
44700	the reason for this specialized being to exist, what extra qualities it
44800	possesses or lacks wrt its parent, etc.  It can figure out who it affects
44900	by examing its various parts, and it bases the complexity vector on
45000	that of the parent and on the changes in this new being. Thus it basically
45100	does gets, evals, and puts. It updates various assert lists, and semi-
45200	compiles the new being. One bit of knowledge says that the explicit args
45300	check may be significantly different, and the user may be queried.
45400	
45500	
45600	
45700		⊗5E. Low-level problem-independent knowledge: bits of programs⊗*
45800	
45900	⊗4PROPOSE PLAUSIBLE NAMES⊗*
46000	This being is called by GetName, and should incorporate a good model of user
46100	psychology of name-choosing. Currently, it applies Initials, Mainwords,
46200	Firstfew, and compositions of these,  to the task description sentence.
46300	
46400	⊗4SEMI COMPILE⊗*
46500	Basically, beings only lend themselves to interpreting; to help speed up
46600	this process, this being can take advantage of regularities and simplicities
46700	in anohter being's parts, and turn out a fast function to do an equivalent
46800	job.  For example, if a being doesn't enable any new demons, we needn't
46900	push the demon stack at the beginning and pop it at the end.  
47000	
47100	⊗4SELECT STRUCTURE TYPE⊗*
47200	This is a true bit of programming knowledge: if the structure is composed
47300	of several weakly-interacting pieces tied together through association with
47400	one atom, then the data structure should be a list of these atoms, with
47500	the associated structures being stored on the property lists of the atoms.
47600	If there are only a couple pieces, or they interact strongly, we should
47700	use a nested list structure instead.
47800	
47900	⊗4ELEMENT⊗*
48000	All we know about an element is that it is a member of a data structure,
48100	and that we should not be ashamed to ask the user to clarify himself if
48200	he mentions this. The ConceptFormation being should note that future
48300	refernces to element actually refer to a scene, an instance of a concept,
48400	and that  references to class refer to the model of a concept, a set of
48500	scenes.  This may change as new data structures come into existence.
48600	
48700	⊗4MODIFY STRUCTURE⊗*
48800	Generally, we will be given a typical element, and must identify the 
48900	structures it blongs to, and modify them. The precise details indicate that
49000	some subset of CONDITIONAL INSERTION, CONDITIONAL DELETION, and COMPLEX
49100	ALTERATION will be involved.
49200	
49300	⊗4GET HOLD OF⊗* by guessing and checking
49400	We must discover whether an algorithm already exists which can get arg1.
49500	If not, we try to find one which can get arg1 and some other effects. We
49600	structure the function as some of COMPUTE, GENERATE&TEST, and SEARCH. 
49700	Finally, we must decide now on the type of error recovery desired: none,
49800	backtrack, or correct-and-try-to-proceed.
49900	
50000	⊗4TAKE HOLD OF⊗* trivially
50100	We examine the flow of control through the preceding code, to decide
50200	whether arg1 has ever been SET. If so, we must ask the user whether or not
50300	to read in a new value. If no read is to be done, then this being reduces
50400	to a simple assignment, or perhaps to nothing at all. Otherwise, we read
50500	in the object, and assign its various subparts to SOME PART OF it.
50600	
50700	⊗4IS OF TYPE⊗*
50800	This being is a predicate which is too low-level and general to do much.
50900	Basically, it helps formulate a question to the user, who must explain
51000	to PUP how to construct any predicate it comes across, usually just by
51100	translating a sentence the user types in.
51200	
51300	⊗4COMPLEX ALTERATION⊗*
51400	This being is always replaced by some initializing assignments, followed
51500	by calls on MODIFY STRUCTURE for some subparts of arg1. A bit of advice
51600	is that if arg1 is unstructured, try to get it structured. If this fails,
51700	maybe what is really wanted is to modify the structure of which arg1 is
51800	a member.
51900	
52000	⊗4CONDITIONAL DELETION⊗*
52100	As above, we look at the structuring of various arguments to decide what
52200	is really supposed to be deleted from where. We check with the user,
52300	remind him of various bindings relevant to the current call, and ak him
52400	to describe (1) under what conditions we do the deletion, (2) what exactly
52500	do we delete. These are translated, analyzed in the context of deletion,
52600	and help determine the deletion piece of code.
52700	
52800	⊗4CONDITIONAL INSERTION⊗*
52900	This is similar to the preceding being. Here we also worry about
53000	whether the entity to be inserted has ever been bound. If not, we must see
53100	that it is! Often, this binding will be related to the Initialize part of
53200	the DataStructure piece of the being representing the structure we're
53300	inserting into.  Since some data structures have several similar but
53400	distinctly-named elements existing simultaneously, we have lots of little
53500	worries. After we have planned out the code, we check with the coding 
53600	warning list and add to it; e.g., any undefined initial value of a variable
53700	in a piece of code we stuck in here, will also be uninitialized here. If
53800	we later decide never to worry about the first initialization, we must
53900	not forget this one! This is a frequent source of bugs, I think.
54000	
54100	⊗4GENERATE&TEST⊗* and   ⊗4COMPUTE⊗* are not needed or implemented.
54200	
54300	⊗4SEARCH⊗*
54400	Currently, a primitive type of searching suffices. We simply execute
54500	the iteration FOREACH X IN XSET DO (TEST X).  If we're unsure, we check
54600	that XSET has the form SET OF (PLURAL X). The specializations tell us to
54700	worry about termination conditions. I suppose this being could also be
54800	used for generate-and-test.
54900	
55000	⊗4FOREACH⊗*
55100	Not surprisingly, this is the iteration being. It is an NLAMBDA function,
55200	so its arguments aren't surrounded by quotes. There are various minor
55300	decisions to make, which simplify any specialized version:
55400	there may or may not be an "until" condition; we must get it and also
55500	decide what to bind the iteration variable to and what value to return,
55600	both in case this condition is met, and in case we exhaust the space.
55700	Often, we will decide to leave some of these unspecified, put notes about
55800	them on the coding warning list, and not worry about them for a long time.
55900	
56000	⊗4TEST⊗*
56100	This being is a predicate which is oriented toward a threshold of accepta-
56200	bility, whereas ISOFTYPE is oriented toward separating cases. It either
56300	has the flavor of comparing, or of competition.  We must also decide
56400	whether the result is nominal, ordinal, or of ratio caliber. These latter
56500	two never crop up, which is why we assume the test is a predicate always.
56600	
56700	⊗4SOME PART OF⊗*
56800	We either get this simple structural function by examples (like SHaw's
56900	program) or by translating a user-supplied English sentence describing
57000	what it does. Typically, some combinations of CAR, CDR, CONS, and NIL.
57100	
57200	⊗4COMPARE⊗*
57300	We must worry about whether the result is nominal, ordinal, or ratio.
57400	We also decide whether the comparison is a JOINING FUNCTION applied to
57500	its arguments, a constant function (executed only for side effects),
57600	or a JOINING FUNCTION applied to the results of COMPAREing corresponding
57700	subparts of the arguments. This last case is both common and complicated.
57800	If neither argument is structured, this is impossible. If both are highly
57900	structured, their structures must have a nontrivial amount of correspondence
58000	in order to succeed. If only one argument is structured, this strongly
58100	suggest that the other one should be similarly structured. Often we go
58200	ahead and structure it without asking the user.
58300	
58400	⊗4BETTER⊗*
58500	We've discussed this earlier. Here we should note that when called on
58600	to write a new, specialized BETTER function, it chooses either a simple
58700	function (T, NIL) to allow an optimization (replace by CONS, replace by
58800	NCONC1), or it chooses a complex comparing function (e.g., alphorder,
58900	write-program itself!).
59000	
59100	⊗4JOINING FUNCTION⊗*
59200	This is a way of condensing results. It is typically a known function,
59300	asuch as AND, OR, PLUS, MAXIMUM, PROGN...  which is determined by (i) the 
59400	character of the result (numerical, T/F) and (ii) whether we are in
59500	a DO UNTIL loop, a DO REPEATEDLY loop, or neither of these loops.
59600	
59700	
59800		⊗5F. Programming Knowledge stored in variables⊗*
59900	
60000		To resolve an ADAPTATION decision
60100	Ask for sample dialog, symbolically run current code, modifying I/O formats.
60200	
60300		To defer any decision whose AFFECTS part is known
60400	It may translate as some detail of x; in that case, wait until some code for
60500	x already exists. The affects part may translate as The x algorithm; in this
60600	case we worry as soon as PUP begins DO-ing any coding at all of x.
60700	
60800		To defer an ALTERNATIVES decision
60900	Examine the coding tasks in all cases, and try to find some common head
61000	and/or some common tail. If there is any, try to defer the decision until
61100	after writing code for this common subtask sequence. If one alternative
61200	exactly matches the common sequence, we can temporarily assert that the
61300	decision has been made to do this simplest being.
61400	
61500	 	To terminate an AND loop
61600	The conjunction is usually between highly similar objects. Related to how
61700	to parse a sentence containing ANDs.
61800	
61900		To resolve a BOOLEAN decision
62000	Ask the user to respond Yes or No. The decision has special Yes, No, and
62100	Both parts; in each case, two of them will be evalled.
62200	
62300		To resolve a DEFINITION decision
62400	Locate the defined object, reaffirm that it is undefined, ask the user to
62500	define it, check whether it is a predicate, data structure, etc., and tell
62600	the user about such constraints.
62700	
62800		To resolve a DICHOTOMY decision
62900	Each such decision contains a little program which now is evaluated. If
63000	it succeeds, its value answers the decision; if it fails, we have to ask
63100	the user to choose alternative 1 or 2. The choice points to a little
63200	program to run, and its value is the desired resolution.
63300	
63400		To resolve a PREDICATE decision
63500	Fix the context of the predicate call, try to relate this predicate to some
63600	known tests, and, if this fails, ask the user. His response is specially
63700	processed in the context of a predicate fixation.
63800	
63900		To set up a data structure stored as a LIST STRUCTURE
64000	The initialization is a simple SETQ to an as-yet undefined value.
64100	Access is simply by mentioning the name. Insertion is by Merge:in or
64200	Merge. Deletion is by Pullout or Setdifference.
64300	
64400		To set up a data structure stored as a PROPERTY LIST STRUCTURE.
64500	Delineate the pieces to be associated with each atom, and name them.
64600	Accession is via GETP. Insertion s via a GETP, then a MERGE:IN, then a
64700	PUT. Deletion is via a GETP, then a PULLOUT, then a PUT. Initialization
64800	is a PUT of a SETQ to an as-yet undefined value.  Each named substructure
64900	must itself become a data structure, typically a simple list structure.
65000	
65100		To resolve a SOMEOF decision
65200	The user must be sufficiently informed about the choices. He types back
65300	a non-null, ordered subset of those choices. We examine them to see if
65400	they should be enclosed in a PROGN or in a REPEATEDLY, and whether we
65500	would like to see something structured in a certain way.
65600	
65700		To resolve a SUBSETOF decision
65800	Similar to above, but no fancy investigation, just string the choices
65900	together in a PROGN.
66000	
66100		To defer any decision whose WHEN part is provided
66200	We transform "before deciding firmly how to ←x ←←y" into something like
66300	"member (cons detail-of-$x-ing $y) doing-pup-list". We also translate
66400	Before any ←←x routine is finalized,  After ←←x routines are finalized,
66500	and similar phrases. These must always evaluate T or NIL.
66600	
66700	
66800		⊗5G. Demons and functions of interest⊗*
66900	
67000	 LIST:JOIN,   MERGE:IN,   PULLOUT
67100	These rather standard functions are gicen a tiny bit of advice: if their
67200	"element" is more like a sublist than an element of their "list", then
67300	they assume that what was meant was append or setidifference, not cons
67400	or merge or remove.
67500	
67600	 LONG NAME DEMON
67700	If any name becomes unwieldy, he notices it and asks the user for a
67800	nickname.
67900	
68000	 PERMIT DETAILED DECISION
68100	Implicit near the beginning of each decision, PUP called this function.
68200	It asks the user if he wnats more details, and if so it gets them and
68300	prints them out.
68400	
68500	 STRUCTURE INDUCING DEMON
68600	If the object is a being already, special considerations apply.
68700	If the object and all arg values are ill-defined, we decide not to do
68800	any structuring. Otherwise, we investigate the effects of structuring 
68900	the object into the pieces specified in the args. If there is no problem,
69000	and if the user consents, we tack the appropriate Replace messages onto the
69100	coding warning list (with a high priority). We activate Long Name Demon.
69200	
69300	⊗5Natural Language Translation⊗*
69400	We have already discussed the TRANSLATE being and the basic way of handling
69500	natural language input.  Several beings exist primarily for this purpose;
69600	RECOGNIZE:CONDITIONAL,  RECOGNIZE:CONJUNCTION,   RECOGNIZE:EQUALITY,
69700	RECOGNIZE:FUNCTION:RETURNS,  RECOGNIZE:INCLUSION,  RECOGNIZE:LITERALS,
69800	RECOGNIZE:SET:RELATIONS,   RECOGNIZE:SOME:MEMBER,   ADD:DEFINITION,
69900	ADJECTIVE:HANDLER,   REPEATEDLY.   Also, there are several functions
70000	related to translation: e.g., UNGERUNDIFY, PLURAL, OPPOSITE.
70100	All these are straight-forward, and their task is obvious from their name.
     

00100	
00200	.SKIP TO COLUMN 1
00300	.SELECT 1
00400	
00500	    ⊗5(ii) The increment of knowledge necessary to write GI⊗*
00600	
00700	⊗4APPLY  RULE⊗*  This  BEING  accepts  two  arguments,  which must be
00800	interpretable as a string and a rule, respectively.   The  string  is
00900	compared  against  the  left side of the rule and, if applicable, the
01000	change indicated by the right side is executed on the string.  It  is
01100	immediately  encodable ifthe user wishes all possible apllications of
01200	the rule to the string; else  a  more  specialized  version  must  be
01300	synthesized.
01400	
01500	⊗4CONSTRAIN⊗*  Try  first to decide if it is meaningful for the given
01600	structure to be constrained, and if so, how.  Next  see  whether  any
01700	other  BEINGs  can  help.   Finally,  ask  the  user  to  specify any
01800	constraints he can think of.  For example, can a list structure  grow
01900	indefinitely,  or can we use some fancy programming trick because the
02000	size stays small.
02100	
02200	⊗4GRAMMATICAL INFERENCE⊗* Needless to  say,  this  is  a  high-level,
02300	domain-specific BEING! It must infer grammars from exemplary strings;
02400	it knows this to be the only reasonable g.i. paradigm. If  it  fails,
02500	some    genralizations    are   inductive   inference,   enumeration,
02600	problem-solving;  some  alternatives  are   concept   formation   and
02700	simulated  evolution.  There are many minor decisions, similar to the
02800	concept formation decisions (e.g., examine a sample GI-user  dialogue
02900	to finalize the printing formats). The major decision is dichotomous:
03000	whether our new specialized BEING should retain the ability to  input
03100	the  type  of  grammar  and vary its strategy accordingly, or whether
03200	only one fixed type of grammar (e.g., context-free)  will  always  be
03300	used,  and may be "built in" to the code. The result of this decision
03400	is to pass control to one or the other of the following two BEINGS.
03500	
03600	⊗4INFER MULTI-CLASS GRAMMARS⊗* We read in the type  of  grammar,  and
03700	then  call  the appropriate specialist for that type.  Thus we have a
03800	big switch here.
03900	
04000	⊗4INFER   FIXED-CLASS   GRAMMARS⊗*   This   routine   determines   at
04100	program-synthesis  time  what the class is going to be, and thus will
04200	be a fixed call to one of the following  four  BEINGS.  The  speedups
04300	will arise from using the constraints on the rules.
04400	
04500	⊗4INFER  PHRASE-STRUCTURE GRAMMARS⊗* There are no rule constraints in
04600	a type 0 grammar; each half of the rule is  viewed  as  an  arbitrary
04700	list    of    letters.      When    a    TEST   is   indicated,   the
04800	fringe-of-conciousness demon must report it  is  thinking  of  PARSE.
04900	The  left and right sides of a rule will be destructive operations on
05000	the data representation of a rule.
05100	
05200	⊗4INFER CONTEXT-SENSITIVE GRAMMARS⊗* We  shall  only  report  on  the
05300	differences  between  the  INFER...  BEINGS.  This one knows that the
05400	right side of the rule must be at least as long  as  the  left  side.
05500	This  will  be  used  as a pruning heuristic when proposing plausible
05600	rules.
05700	
05800	⊗4INFER CONTEXT-FREE GRAMMARS⊗* Grammars of type 2 have as their left
05900	side  a single nonterminal. Further simplifications can occur by only
06000	working toward a Griebach Normal Form or Chomsky Normal Form grammar,
06100	although from the standpoint of inference energy these are harder.
06200	
06300	⊗4INFER  REGULAR  GRAMMARS⊗*  A  type three grammar has a unit on the
06400	left and a  pair  of  them  for  a  right  side  (one  terminal,  one
06500	nonterminal). This is a very powerful pruning heuristic.
06600	
06700	⊗4MAJOR  MODIFY  STRUCTURE⊗*  The  old,  simple "insert,delete,alter"
06800	paradigm of modification was no longer sufficicient. This BEING heads
06900	a  whole  complex  of  modify  BEINGS,  including  the old one as the
07000	low-level workhorse primitive. Here is a sketch of the organization:
07100			MAJOR MODIFY STRUCTURE
07200			|	|	| MODIFY UNTIL | MODIFY SOME
07300			|	|	| 	THE ORIGINAL MODIFY STRUCTURE
07400	BEING So this top-level modifier calls some subset of the three lower
07500	modification BEINGS. Later, we will add a fourth alternative, EXAMINE
07600	DATA STRUCTURE, to aid in writing the AIR program.
07700	
07800	⊗4MODIFY SOME⊗* We determine a set S and a predicate P  at  synthesis
07900	time.  At  run  time,  we  map  through  S  and apply P; all elements
08000	responding  positively  are  modified  (using  the  original   MODIFY
08100	STRUCTURE.) The decisions about P and S are easily deferrable.
08200	
08300	⊗4MODIFY   UNTIL⊗*   This   BEING  is  simply  an  order  to  compose
08400	REPEATEDLYπ.MODIFY_STRUCTURE.  The former BEING bears  the  brunt  of
08500	the responsibility of the interface.
08600	
08700	⊗4PARSE⊗* Attempt to parse a string from the current set of rules, by
08800	reversing each rule and composing their applications.  We  decide  at
08900	synthesis  time  whehter  or not to maintain a parse tree, whether or
09000	not to maintain a list of the rules used during the parse, whether we
09100	stop  parsing  at  any  legal string or only at "S," whether we parse
09200	forwards or backwards or both, how deeply in each direction (this  is
09300	always  deferred  until  much  later),  whether  one  direction after
09400	another or (simulated) simultaneously. We look for the data structure
09500	part  of the BEING representing a rule, and ask him about constraints
09600	on the rule, and about how to  destructively  recover  the  left  and
09700	right sides separately.
09800	
09900	⊗4PARSE  BACKWARD⊗*  This  BEING  is  given  two strings and a set of
10000	rules; the task is to apply anti-rules to the target string until  it
10100	becomes  the  initial  string.  This is typically done breadth-first.
10200	Special modifications must be made  if  there  is  a  parse  tree  to
10300	maintain,  if  a set of rules used must be maintained, etc. The basic
10400	body is a nest of FOREACH calls (∀rule, ∀way of applying  therule  to
10500	the  string,  recurse). To avoid infinite recursion, we must supply a
10600	third argument, the depth to which we compose these anti-rules before
10700	we  give  up.  When  calling  itself  recursively,  this  level  will
10800	bedecremented.
10900	
11000	⊗4PARSE FORWARD⊗* This BEING is analogous to the previous one,  using
11100	rules  themselves instead of anti-rules. Notice how clearly the place
11200	to insert searching heuristics is marked out for us (althoug none are
11300	present.)
11400	
11500	⊗4STRING⊗*  This  is  a  structure  whose parts are a name, a list of
11600	letters, a set of comments.  It is advisable to use  list  structures
11700	rather  than  property  lists  to  represent strings, since they will
11800	probably only be accessed by one of their three  parts.   In  the  GI
11900	program,   we   don't  use  STRING  itself,  but  rather  we  mention
12000	UNCOMMENTED  STRING,  which  causes  this  BEING  to  create  a   new
12100	specialized version of itself, sans the third, comments part.
     

00100	
00200	    ⊗5(iii) The increment of knowledge necessary to write AIR⊗*
00300	
00400	⊗4EXAMINE STRUCTURE⊗* This is another one of the  parts  of  a  major
00500	MODIFY  structure.  If the fringe of conciousness demon can't come up
00600	with a reasonable matching function, one is selected now.  The  basic
00700	body  says to do PATTERN MATCH, using this match function, and convey
00800	the results to the caller (who may be the user.) The inputs are  thus
00900	a  pattern,  a  data  structure  name, and possibly a hint to a match
01000	function.
01100	
01200	⊗4PATTERN MATCH⊗* This existed as a system function earlier, but  for
01300	AIR  it  is  necessary  to  write  a  tailored  pattern-matcher.   In
01400	particular, AIR demands that we strip away the common head  and  tail
01500	from  both pattern and expression, and then compose the two remaining
01600	pieces into the left and right sides of a  new  plausible  rule,  and
01700	then  check that this conforms with the constraints on rules. This is
01800	certainly different from the type of match needed by CF!  Notice that
01900	we had to add the eliminate common head/tail functions to our list of
02000	system primitive functions.